Data Visualization with R

require(ggplot2)
require(ggthemes)
require(xtable)
require(qdata)
data(bands)

The overall appearance of plots is controlled by the theming system, an important component of the ggplot2 grammar. We learned about the theming system in the previous chapters, in particular in “Legend Customization”, “Axes Customization” and “Facet Customization”, where we saw how themes give us control over the non-data elements of a plot, such as fonts, ticks, panel strips and legend keys.

Let us look more closely at the structure of the theming system.

ggplot2 has a default theme, called theme_grey(), with a light grey background and white gridlines.
A theme, or theme function, defines the settings of a collection of theme elements in order to produce a specific graphical style.

Let us see an example, considering the relationship between humidity and viscosity by pressure type in the bands dataset:

pl <- ggplot(data = bands, mapping = aes(x = humidity, y = viscosity, colour = press_type)) +
  geom_point()
pl

In particular, a theme function is composed of:

  • theme elements, which refer to individual attributes of a graphic that are independent of the data and that you can control, such as font size, axis ticks, appearance of grid lines or background color of a legend;

  • theme element functions, which enable you to modify the settings of theme elements. Each theme element is associated with an element function, which describes the visual properties of that element.

Each theme element has a default value that can be modified locally, for a specific ggplot object, by using the theme() function.

Suppose we want to modify the colour of axis lines:

pl + 
  theme(axis.line.x = element_line(colour = "black"),
        axis.line.y = element_line(colour = "black"))

Here axis.line.x and axis.line.y are the theme elements that control the appearance of the x and y axis lines respectively, and element_line() is the theme element function that allows us to modify them.

Moreover, if you don’t like theme_grey(), you can replace it entirely with another theme function (a complete theme). Let us set theme_bw(), a theme with a white background and thin grey grid lines:

pl + 
  theme_bw()
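Because a complete theme replaces every element, the order in which it is combined with theme() matters: a tweak added before the complete theme is discarded. The sketch below illustrates this by inspecting the legend.position element stored in each plot. It uses the built-in mtcars dataset as a stand-in for bands, an assumption made only to keep the example self-contained.

```r
library(ggplot2)

# mtcars is a stand-in for bands (assumption, for a self-contained example)
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# Tweak added AFTER the complete theme: the tweak is kept
p_kept <- p + theme_bw() + theme(legend.position = "bottom")

# Complete theme added LAST: it replaces the earlier tweak
p_lost <- p + theme(legend.position = "bottom") + theme_bw()

# Compare the legend position stored in each plot's theme
p_kept$theme$legend.position
p_lost$theme$legend.position
```

In practice this suggests adding the complete theme first and any theme() adjustments afterwards.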

Understanding how to work with theme element functions and theme elements is very important when customizing plots.

The most important theme element functions are:

  • element_text(): controls the drawing of labels and headings.
    The table below lists its arguments and their corresponding default values:

    Argument    Description               Default Value
    family      font family               ‘’
    face        font face                 ‘plain’
    colour      font color                ‘black’
    size        font size (pts)           10
    hjust       horizontal justification  0.5
    vjust       vertical justification    0.5
    angle       text angle                0
    lineheight  line height               1.1

  • element_line(): draws lines and segments such as graphics region boundaries, axis tick marks and grid lines.
    The table below lists its arguments and their corresponding default values:

    Argument    Description               Default Value
    colour      line color                ‘black’
    size        line thickness            0.5
    linetype    type of line              1

  • element_rect(): draws rectangles. It is mostly used for background elements and legend keys.
    The table below lists its arguments and their corresponding default values:

    Argument    Description               Default Value
    fill        fill color                NA (none)
    colour      border color              ‘black’
    size        thickness of border line  0.5
    linetype    type of border line       1 (solid)

  • element_blank(): draws nothing. This function has no arguments.

There are around 40 unique theme elements that control the appearance of plots. They can be roughly grouped into five categories: plot, axis, legend, panel and facet.
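To check which elements an installed ggplot2 version defines, one option is to list the names stored in the default theme. The sketch below is only an illustration: the exact count and names depend on the ggplot2 version, and grouping by the text before the first dot is a rough heuristic rather than an official classification.

```r
library(ggplot2)

# Names of the elements set by the default complete theme
element_names <- names(theme_grey())
length(element_names)

# Rough grouping by prefix: everything before the first dot
prefixes <- sub("\\..*$", "", element_names)
table(prefixes)
```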

Let us schematize them:

  • plot theme elements

    Element          Setter          Description
    plot.background  element_rect()  plot background
    plot.title       element_text()  plot title
    plot.margin      unit()          margins around plot

Let us see an example:

pl + 
  labs(title = "Plot title") +
  theme(plot.title = element_text(size = 20, vjust = 2),
        plot.background = element_rect(
          fill = "lightblue",
          colour = "black",
          size = 2,
          linetype = "longdash"),
        plot.margin = unit(c(1, 1, 1, 1), "in"))

  • axis theme elements

    Element            Setter          Description
    axis.line          element_line()  line parallel to axis (hidden in default theme)
    axis.text          element_text()  tick labels
    axis.text.x        element_text()  x-axis tick labels
    axis.text.y        element_text()  y-axis tick labels
    axis.title         element_text()  axis titles
    axis.title.x       element_text()  x-axis title
    axis.title.y       element_text()  y-axis title
    axis.ticks         element_line()  axis tick marks
    axis.ticks.length  unit()          length of tick marks
    axis.ticks.margin  unit()          width of axis tick margin

Let us see an example:

pl + 
  theme(
    axis.line.x = element_line(colour = "red", size = 2),
    axis.line.y = element_line(colour = "orange", linetype = "dashed"),
    axis.text = element_text(color = "blue", size = 15, face = "italic"),
    axis.text.y = element_text(angle = 90, size = rel(0.7), hjust = 0),
    axis.ticks = element_line(colour = "violet"),
    axis.ticks.x = element_line(size = rel(2)),
    axis.title = element_text(size = 20, color = "orangered")
)

  • legend theme elements

    Element               Setter                            Description
    legend.background     element_rect()                    legend background
    legend.key            element_rect()                    background of legend keys
    legend.key.size       unit()                            size of legend keys
    legend.key.height     unit()                            legend key height
    legend.key.width      unit()                            legend key width
    legend.margin         unit()                            legend margin
    legend.text           element_text()                    legend labels
    legend.text.align     numeric                           legend label alignment
    legend.title          element_text()                    legend name
    legend.title.align    numeric                           legend name alignment
    legend.position       ‘left’, ‘right’, ‘bottom’, ‘top’  position of legend
    legend.direction      ‘horizontal’, ‘vertical’          direction of legend keys
    legend.justification  numeric                           justification of legend
    legend.box            ‘horizontal’, ‘vertical’          arrangement of multiple legend boxes

Let us see an example:

pl + 
  theme(
    legend.position = "top",
    legend.box = "horizontal",
    legend.background = element_rect(
      fill = "lemonchiffon",
      color = "black",
      size = 1,
      linetype = "longdash"
    ),
    legend.key = element_rect(fill = "lemonchiffon", color = "magenta"),
    legend.key.width = unit(0.5, "in"),
    legend.key.height = unit(0.2, "in"),
    legend.text = element_text(size = 8),
    legend.title = element_text(face = "bold", size = 10, colour = "magenta"))

  • panel theme elements

    Element             Setter          Description
    panel.background    element_rect()  background of graphics region
    panel.border        element_rect()  border of graphics region
    panel.grid.major    element_line()  major grid lines
    panel.grid.major.x  element_line()  vertical major grid lines
    panel.grid.major.y  element_line()  horizontal major grid lines
    panel.grid.minor    element_line()  minor grid lines
    panel.grid.minor.x  element_line()  vertical minor grid lines
    panel.grid.minor.y  element_line()  horizontal minor grid lines
    panel.margin        unit()          margin between facets
    aspect.ratio        numeric         plot aspect ratio

Let us see an example:

pl + 
  theme(
    panel.background = element_rect(fill = "navy", color = "orange", size = 2),
    panel.border = element_rect(fill = NA, colour = "orange", size = 2),
    panel.grid.major = element_line(color = "gray60", size = 0.8),
    panel.grid.major.x = element_blank())

  • facet theme elements

    Element           Setter          Description
    strip.background  element_rect()  background of panel strips
    strip.text        element_text()  strip text
    strip.text.x      element_text()  horizontal strip text
    strip.text.y      element_text()  vertical strip text

Let us see an example:

ggplot(data = bands, mapping = aes(x = humidity, y = viscosity)) +
  geom_point() +
  facet_grid(band_type ~ press_type) +
  theme(
    strip.background = element_rect(fill = "#4bb8b6", color = "#265665", size = 2),
    strip.text = element_text(face = "italic", size = 15, colour = "#CC1800"),
    strip.text.y = element_text(face = "bold")
  )

We have already learned about most of these ggplot2 theme elements in the previous chapters, in particular the axis, legend and facet theme elements. But what about the plot and panel theme elements?

In the following paragraphs we will see how to handle the most common questions about plot, panel and theme customization.

Add the title to a plot and change its appearance

As we saw in the “Creating a Boxplot” chapter, a title can be added with the ggtitle() or labs() function:

pl + 
  ggtitle("Scatterplot of humidity vs viscosity \n by Pressure type")
pl + 
  labs(title = "Scatterplot of humidity vs viscosity \n by Pressure type")

The previous two commands produce the same result. The "\n" sequence inserts a line break in the title.

The appearance of the title can be changed by setting the plot.title argument of the theme() function:

pl + 
  ggtitle("Scatterplot of humidity vs viscosity \n by Pressure type") +
  theme(plot.title = element_text(
    size = rel(2), lineheight = 0.9, family = "Times",
    face = "bold.italic", colour = "red"))

Modify the appearance of plotting area

If you want to change the appearance of the plotting area, you have to set the panel.xxx arguments of the theme() function.

Change the background of plotting area

The panel.background element modifies the background of the plotting area, and panel.border modifies its border. Both are set with the element_rect() function.

pl + 
  theme(
    panel.background = element_rect(fill="lightblue"),
    panel.border = element_rect(colour="blue", fill=NA, size=2)
)

Pay attention to the fill argument of panel.border: the border is drawn on top of the panel, so you have to specify an empty fill (fill = NA) to avoid covering the data.

Customize and remove grid lines

The panel.grid.major and panel.grid.minor arguments allow us to customize grid lines by using the element_line() function:

pl + 
  theme(
    panel.grid.major = element_line(colour="red"),
    panel.grid.minor = element_line(colour="red", linetype="dashed", size=0.2)
)

If you want to remove grid lines you have to set panel.grid.major and/or panel.grid.minor equal to element_blank():

pl + 
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
)

It’s possible to hide just the vertical or horizontal grid lines with panel.grid.major.x, panel.grid.major.y, panel.grid.minor.x and panel.grid.minor.y arguments of theme() function:

pl + 
  theme(
    panel.grid.major.x = element_blank(), # remove vertical major grid lines
    panel.grid.minor.y = element_blank()  # remove horizontal minor grid lines
)

Modify the plot appearance outside graphical area

If you want to modify the appearance of the plot outside the plotting area, you have to set the plot.xxx arguments of the theme() function.

Change the background outside plotting area

To modify the background, set the plot.background element of the theme() function:

pl + 
  theme(
    plot.background = element_rect(fill = "green")
)

Change the margins around graphical area

To modify the margins around the plotting area, set the plot.margin element of the theme() function:

pl + 
  theme(
    plot.margin = unit(c(2,2,2,2), "cm")
)

unit() creates a grid unit object. plot.margin expects four values, which give the top, right, bottom and left margins, in that order.
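Since plot.margin takes its four values positionally (top, right, bottom, left), named arguments can make the intent easier to read. ggplot2 also provides a margin() helper with arguments t, r, b and l for this purpose. The following is a minimal sketch, using the built-in mtcars dataset as a stand-in for bands (an assumption made to keep the example self-contained).

```r
library(ggplot2)

# mtcars is a stand-in for bands (assumption)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Named arguments: t = top, r = right, b = bottom, l = left
wide_sides <- margin(t = 0.5, r = 2, b = 0.5, l = 2, unit = "cm")

# Narrow margins above and below, wide margins on both sides
p_margin <- p + theme(plot.margin = wide_sides)
```

Printing p_margin draws the plot with the new margins.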

Change the default theme

To change the theme function you have to override the default one. Other theme functions are provided by the ggplot2 and ggthemes packages.

ggplot2 provides these theme functions:

  • theme_bw(): a variation on theme_grey() that uses a white background and thin grey grid lines
  • theme_linedraw(): a theme with only black lines of various widths on white backgrounds, reminiscent of a line drawing
  • theme_light(): similar to theme_linedraw() but with light grey lines and axes, to direct more attention towards the data
  • theme_dark(): the dark cousin of theme_light(), with similar line sizes but a dark background. Useful to make thin coloured lines pop out
  • theme_minimal(): a minimalistic theme with no background annotations
  • theme_classic(): a classic-looking theme, with x and y axis lines and no gridlines. In ggplot2 2.1.0 the axis lines are not visible because of a bug in the function
  • theme_void(): a completely empty theme
pl1 <- pl + theme_bw() + ggtitle("theme_bw()")
pl2 <- pl + theme_linedraw() + ggtitle("theme_linedraw()")
pl3 <- pl + theme_light() + ggtitle("theme_light()")
pl4 <- pl + theme_dark() + ggtitle("theme_dark()")
pl5 <- pl + theme_minimal() + ggtitle("theme_minimal()")
pl6 <- pl + theme_classic() + ggtitle("theme_classic()")
pl7 <- pl + theme_void() + ggtitle("theme_void()")

gridExtra::grid.arrange(pl1, pl2, pl3, pl4, pl5, pl6, pl7, ncol=2)

The ggthemes package provides many more theme functions. Let us see the most commonly used:

pl8 <- pl + theme_tufte() + ggtitle("theme_tufte()") # a minimal ink theme based on Tufte’s The Visual Display of Quantitative Information
pl9 <- pl + theme_solarized() + ggtitle("theme_solarized()") # a theme using the solarized color palette
pl10 <- pl + theme_excel() + ggtitle("theme_excel()") # a theme replicating the classic gray charts in Excel
pl11 <- pl + theme_few() + ggtitle("theme_few()") # a theme from Stephen Few’s “Practical Rules for Using Color in Charts”
pl12 <- pl + theme_economist() + ggtitle("theme_economist()") # a theme based on the plots in The Economist magazine
pl13 <- pl + theme_stata() + ggtitle("theme_stata()") # a theme based on Stata graph schemes
pl14 <- pl + theme_wsj() + ggtitle("theme_wsj()") # a theme based on the plots in The Wall Street Journal

gridExtra::grid.arrange(pl8, pl9, pl10, pl11, ncol=2)

pl12

pl13

pl14

Change the default theme for more than one plot

If you want to change the default theme (theme_grey()) for all plots generated in the current R session, you can use the theme_set() function. For example, if you want a white background for all plots, run:

theme_set(theme_bw())
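theme_set() invisibly returns the theme that was active before the call, which makes it possible to restore the previous default later. The following is a minimal sketch of that pattern; old_theme is a name chosen for this illustration.

```r
library(ggplot2)

# Switch the session default and keep the previous theme
old_theme <- theme_set(theme_bw())

# Plots printed while this default is active are drawn with theme_bw(),
# unless a theme is added to them explicitly

# Restore the previous default when done
theme_set(old_theme)
```

Note that the default theme is looked up when a plot is drawn, so a plot object created earlier also picks up the new default once it is printed.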

Creating your own theme

You can create your own theme by adding elements to an existing theme:

mytheme <- theme_bw() +
  theme(text = element_text(colour = "red"),
        axis.title = element_text(size = rel(1.25)))
pl + 
  mytheme
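A custom theme can also be wrapped in a function, in the same style as the built-in theme functions, so that parameters such as the base font size remain adjustable. The sketch below is one possible way to do this; theme_red_text() is a hypothetical name, and the built-in mtcars dataset is used as a stand-in for bands.

```r
library(ggplot2)

# theme_red_text() is a hypothetical name chosen for this illustration
theme_red_text <- function(base_size = 11) {
  # Start from a complete theme, then override selected elements
  theme_bw(base_size = base_size) +
    theme(
      text = element_text(colour = "red"),
      axis.title = element_text(size = rel(1.25))
    )
}

# mtcars is a stand-in for bands (assumption)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

p_small <- p + theme_red_text()               # default base size
p_large <- p + theme_red_text(base_size = 16) # larger text throughout
```

Because axis.title uses rel(1.25), the axis titles scale together with base_size instead of being fixed at an absolute size.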